Abstract:
The Linear Hinges Model (LHM) is an efficient approach to flexible and robust one-dimensional curve fitting under stringent high-noise conditions. However, it was originally designed to run on a single-core processor with access to the whole input dataset. The surge in data volumes, coupled with the spread of parallel hardware architectures and specialised frameworks, has driven growing interest in, and need for, new algorithms that can handle large-scale datasets, as well as techniques for adapting traditional machine learning algorithms to this new paradigm. This paper presents several ensemble alternatives, based on model selection and combination, that make it possible to obtain a continuous piecewise linear regression model from large-scale datasets using the learning algorithm of the LHM. Our empirical tests show that model combination outperforms model selection and that these methods can provide better results in terms of bias, variance, and execution time than the original algorithm executed over the entire dataset.
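As a rough illustration of the difference between model combination and model selection described above, the following minimal sketch fits one continuous piecewise linear model per data partition and then either averages the models or picks a single one. It is not the LHM learning algorithm: as a stand-in, it assumes a fixed set of knots and an ordinary least-squares fit on a hinge basis, and every name in it (hinge_design, fit_ensemble, combine, select, the synthetic data) is hypothetical.

```python
import numpy as np

def hinge_design(x, knots):
    # Design matrix: intercept, x, and one hinge max(0, x - k) per knot.
    # Any linear combination of these columns is continuous and piecewise linear.
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x] + [np.maximum(0.0, x - k) for k in knots]
    return np.column_stack(cols)

def fit_piecewise(x, y, knots):
    # Stand-in learner (assumption): plain least squares with fixed knots.
    coef, *_ = np.linalg.lstsq(hinge_design(x, knots), y, rcond=None)
    return coef

def predict(coef, x, knots):
    return hinge_design(x, knots) @ coef

def fit_ensemble(x, y, knots, n_parts, rng):
    # Shuffle, split into disjoint partitions, and fit one model per partition.
    # Each fit is independent, so this loop could be run in parallel.
    idx = rng.permutation(len(x))
    return [fit_piecewise(x[p], y[p], knots) for p in np.array_split(idx, n_parts)]

def combine(models):
    # Model combination: the models share knots and are linear in their
    # coefficients, so averaging coefficients equals averaging predictions.
    return np.mean(models, axis=0)

def select(models, x_val, y_val, knots):
    # Model selection: keep the single model with the lowest validation MSE.
    errors = [np.mean((predict(m, x_val, knots) - y_val) ** 2) for m in models]
    return models[int(np.argmin(errors))]

# Synthetic noisy data (assumption): a V-shaped curve with a kink at x = 5.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 20000)
y = np.abs(x - 5) + rng.normal(0, 2, x.size)
knots = np.linspace(1, 9, 9)

models = fit_ensemble(x, y, knots, n_parts=8, rng=rng)
combined = combine(models)
selected = select(models, x[:2000], y[:2000], knots)
```

Averaging coefficients only works here because all partition models share the same knots; models whose knots are placed adaptively would have to be combined at the prediction level instead.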
Lay summary:
This article proposes ensemble methods for obtaining a piecewise linear regression model in a big data context. Tests show that model combination outperforms model selection, offering better results in terms of bias, variance, and execution time.
Keywords: one-dimensional piecewise regression; non-linear regression; curve fitting; ensemble model; model selection; model combination; model parallelism
JCR impact factor and WoS quartile: 1.800 - Q2 (2023)
DOI reference: https://doi.org/10.3390/a17040147
Published in print: April 2024.
Published online: March 2024.
Citation:
S. Moreno, E.F. Sánchez-Úbeda, A piecewise linear regression model ensemble for large-scale curve fitting. Algorithms. Vol. 17, no. 4, pp. 147-1 - 147-27, April 2024. [Online: March 2024]